Abstract. Extending recent work of others, we provide effective bounds on the family of all elliptic curves and one-parameter families of elliptic curves modulo $p$ (for $p$ prime tending to infinity) obeying the Sato-Tate Law. We present two methods of proof. Both use the framework of Murty-Sinha [MS]; the first involves only knowledge of the moments of the Fourier coefficients of the $L$-functions and combinatorics, and saves a logarithm, while the second requires a Sato-Tate law. Our purpose is to illustrate how the caliber of the result depends on the error terms of the inputs and what combinatorics must be done.



1. Introduction 

Recently M. Ram Murty and K. Sinha [MS] proved effective equidistribution results showing the eigenvalues of Hecke operators on the space $S(N, k)$ of cusp forms of weight $k$ and level $N$ agree with the Sato-Tate distribution. Our goal here is to use their framework to prove similar results for families of elliptic curves. We shall do this for the family of all elliptic curves and for one-parameter families of elliptic curves.

We first review notation and previous results. Let $E : y^2 = x^3 + Ax + B$ with $A, B \in \mathbb{Z}$ be an elliptic curve over $\mathbb{Q}$ with associated $L$-function

\[ L(E, s) = \sum_{n=1}^{\infty} \frac{a_E(n)}{n^s} = \prod_{p} \left(1 - \frac{a_E(p)}{p^s} + \frac{\chi_0(p)}{p^{2s-1}}\right)^{-1}, \tag{1.1} \]

where $\Delta = -16(4A^3 + 27B^2)$ is the discriminant of $E$, $\chi_0$ is the principal character modulo $\Delta$, and

\[ a_E(p) = p - \#\{(x, y) \in (\mathbb{Z}/p\mathbb{Z})^2 : y^2 \equiv x^3 + Ax + B \bmod p\} = -\sum_{x \bmod p} \left(\frac{x^3 + Ax + B}{p}\right). \tag{1.2} \]

By Hasse's bound we know $|a_E(p)| \le 2\sqrt{p}$, so we may write $a_E(p) = 2\sqrt{p}\cos\theta_E(p)$, where we may choose $\theta_E(p) \in [0, \pi]$. See [Sil1, Sil2, ST] for more details and proofs of all the needed properties of elliptic curves.

How the $a_E(p)$'s vary is of great interest. One reason for this is that they encode local data (the number of solutions modulo $p$), and are then combined to build the $L$-function, whose properties give global information about $E$. For example, the Birch and Swinnerton-Dyer conjecture [BS-D1, BS-D2] states the rank of the group of rational solutions of $E$ equals the order of vanishing of $L(E, s)$ at the central point. While we are far from being able to prove this, the evidence for the conjecture is compelling, especially in the case of complex multiplication and rank at most 1 [Bro, CW, GKZ, GZ, Kol1, Kol2, Ru]. In addition there is much suggestive numerical evidence for the conjecture; for example, for elliptic curves with modest geometric rank $r$, numerical approximations of the first $r - 1$ Taylor coefficients are consistent with these coefficients vanishing (see for instance the families studied in [Fe1, Fe2, Mil3]).

If $E$ has complex multiplication^1 then $a_E(p) = 0$ for half the primes; i.e., $\theta_E(p) = \pi/2$. The remaining angles $\theta_E(p)$ are uniformly distributed in $[0, \pi]$ (this follows from [Deu, He1, He2]).

^1 This means the endomorphism ring is larger than the integers. For example, $y^2 = x^3 - x$ has complex multiplication, as can be seen by sending $(x, y) \to (-x, iy)$. Note $a_E(p) = 0$ if $p \equiv 3 \bmod 4$ (this can be seen from the definition of $a_E(p)$ as a sum of Legendre symbols, sending $x \to -x$).

If $E$ does not have complex multiplication, which is the case for most elliptic curves, then Sato and Tate [Ta] conjectured that as we vary $p$, the distribution of the $\theta_E(p)$'s converges to $2\sin^2\theta\, d\theta/\pi$. More precisely, for any interval $I \subset [0, \pi]$ we have

\[ \lim_{x \to \infty} \frac{\#\{p \le x : \theta_E(p) \in I\}}{\#\{p \le x\}} = \int_I \frac{2\sin^2\theta}{\pi}\, d\theta; \tag{1.3} \]

we call $2\sin^2\theta\, d\theta/\pi$ the Sato-Tate measure, and denote it by $\mu_{ST}$. By recent results of Clozel, Harris, Shepherd-Barron and Taylor [CHT, HS-BT, Tay], this is now known for all such $E$ that have multiplicative reduction at some prime; see also [BZ] for results on the error terms when $|I|$ is small (these results are not for an individual curve, but rather averaged over the family of all elliptic curves) and [B-LGG, B-LGHT] for generalizations to other families of $L$-functions.

Instead of fixing an elliptic curve and letting the prime vary, we can fix a prime $p$ and study the distribution of $\theta_E(p)$ as we vary $E$. Before describing our results, we briefly summarize related results in the literature concerning Sato-Tate behavior in families. Serre [Ser] considered a similar question, not for elliptic curves, but rather for $S(N, k)$, the space of cusp forms of weight $k$ on $\Gamma_0(N)$. He proved that for even $k$ with $N + k \to \infty$ the eigenvalues of the normalized $p$-th Hecke operators are equidistributed in $[-2, 2]$ with respect to the measure

\[ \mu_p = \frac{p+1}{\pi}\, \frac{\sqrt{1 - x^2/4}}{(p^{1/2} + p^{-1/2})^2 - x^2}\, dx; \tag{1.4} \]

changing variables by setting $x = 2\cos\theta$ this is equivalent to the measure $\tilde{\mu}_p$ on $[0, \pi]$ given by

\[ \tilde{\mu}_p = \frac{2(p+1)}{\pi}\, \frac{\sin^2\theta}{(p^{1/2} + p^{-1/2})^2 - 4\cos^2\theta}\, d\theta. \tag{1.5} \]

Note that as $p \to \infty$, $\mu_p \to \mu_{ST}$; for $p$ large these two measures assign almost the same probability to an interval $I$, differing by $O(1/p)$. See [CDF, Sar] for other families with a similar distribution.
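As a numerical illustration of the last remark, the following minimal sketch compares the measure in (1.5) with the Sato-Tate measure on a fixed interval. The function names and the midpoint-rule integrator are illustrative choices, not part of the original text.

```python
import math


def mu_p_density(theta, p):
    """Density of the measure (1.5) on [0, pi] for a prime p."""
    numerator = 2.0 * (p + 1) * math.sin(theta) ** 2
    denominator = math.pi * ((math.sqrt(p) + 1.0 / math.sqrt(p)) ** 2
                             - 4.0 * math.cos(theta) ** 2)
    return numerator / denominator


def sato_tate_density(theta):
    """Density 2 sin^2(theta) / pi of the Sato-Tate measure on [0, pi]."""
    return 2.0 * math.sin(theta) ** 2 / math.pi


def integrate(f, a, b, steps=20000):
    """Composite midpoint rule (an illustrative choice of quadrature)."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def interval_gap(p, a, b):
    """Absolute difference of the two measures on the interval [a, b]."""
    mass_p = integrate(lambda t: mu_p_density(t, p), a, b)
    mass_st = integrate(sato_tate_density, a, b)
    return abs(mass_p - mass_st)
```

Calling `interval_gap(p, a, b)` for increasing primes `p` shows the gap shrinking roughly like a constant multiple of $1/p$, in line with the $O(1/p)$ statement above.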

Serre's theorem was ineffective, and has recently been improved by M. R. Murty and K. Sinha [MS]. They show that if $\{a_n(p)/p^{(k-1)/2}\}_{1 \le n \le \#S(N,k)}$ denote the normalized eigenvalues of the Hecke operator $T_p$ on $S(N, k)$, then

\[ \frac{\#\{1 \le n \le \#S(N,k) : a_n(p)/p^{(k-1)/2} \in I\}}{\#S(N,k)} = \int_I \mu_p + O\left(\frac{\log p}{\log kN}\right), \tag{1.6} \]

where $\#S(N, k)$ is the number of cusp forms of weight $k$ and level $N$, and if $N > 61$ then by Corollary 15 of [MS] we have

*m <mii , k)< m +1 , (1.7, 

200 - tt v > ; - 12 -r , v ; 

where $\psi(N) = N \prod_{p \mid N} (1 + 1/p)$. This effective version of equidistribution allows Murty and Sinha to derive many results, such as

• an effectively computable constant $B_d$ such that if $J_0(N)$ (the Jacobian of the modular curve $X_0(N)$) is isogenous to a product of $\mathbb{Q}$-simple abelian varieties of dimensions at most $d$, then $N \le B_d$;

• the multiplicity of any given eigenvalue of the Hecke operators is $\ll \#S(N,k)\, \log p / \log kN$.

The purpose of this paper is to extend the techniques in [MS] to families of elliptic curves. Unlike [MS, Ser], we cannot keep the prime fixed throughout the argument, as there are only finitely many distinct reductions of elliptic curves modulo $p$. Instead we fix a prime and study the angles $\theta_E(p)$ for one of the two families below, and then send $p \to \infty$. We study

(1) The family of all elliptic curves modulo $p$ for $p \ge 5$. We may write these curves in Weierstrass form as $y^2 = x^3 - ax - b$ with $a, b \in \mathbb{Z}/p\mathbb{Z}$ and $4a^3 \ne 27b^2$. The number of pairs $(a, b)$ satisfying these conditions^2 is $p(p - 1)$.

(2) One-parameter families over $\mathbb{Q}(T)$: let $A(T), B(T) \in \mathbb{Z}[T]$ and consider the family $y^2 = x^3 + A(T)x + B(T)$ with non-constant $j(T)$.^3 We specialize $T$ to be a $t \in \mathbb{Z}/p\mathbb{Z}$. The cardinality of the family is $p + O_{A,B}(1)$ (we lose a few values when we specialize as we require the reduced curves to be elliptic curves modulo $p$), where the error is a function of the discriminant of the family.
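The count $p(p-1)$ for the first family can be confirmed by brute force for small primes. The sketch below is only a sanity check; the function name is a hypothetical helper, not notation from the paper.

```python
def count_nonsingular_pairs(p):
    """Count pairs (a, b) mod p with 4a^3 != 27b^2, i.e. the curves
    y^2 = x^3 - ax - b that are elliptic curves modulo p."""
    return sum(
        1
        for a in range(p)
        for b in range(p)
        if (4 * a ** 3 - 27 * b ** 2) % p != 0
    )
```

For each prime $p \ge 5$ tried, `count_nonsingular_pairs(p)` should agree with `p * (p - 1)`, so exactly $p$ pairs are excluded.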

^2 If $a = 0$ then the only $b$ which is eliminated is $b = 0$. If $3a$ is a non-zero square modulo $p$ there are two $b$ that fail, while if $3a$ is not a square then no $b$ fail. Thus the number of bad pairs $(a, b)$ is $p$.

^3 Up to constants, $j(T)$ is $A(T)^3/(4A(T)^3 + 27B(T)^2)$.

Notations:

• We let $\mathcal{F}_p$ denote either family, and write $V_p$ for its cardinality (which is $p(p - 1)$ in the first case and $p + O(1)$ in the second).

• While we may denote the angles by $\theta_E(p)$, $\theta_{a,b}(p)$ or $\theta_t(p)$, as $p$ is fixed, for notational convenience and to unify the presentation we shall denote these by $\theta_n$, with $1 \le n \le V_p$.

• We let $e(x) = e^{2\pi i x}$.

Normalizations:

• For the family of all elliptic curves, we may match the elliptic curves in pairs $(E, E')$ such that $\theta_{E'}(p) = \pi - \theta_E(p)$ (and each curve is in exactly one pair); see Remark 1.1 for a proof. Thus, if we let $x_n = \theta_n(p)/\pi$, we see that the set $\{2\pi x_n\}_{n \le V_p}$ is symmetric about $\pi$. This will be very important later, as it means $\sum_{n \le V_p} \sin(2\pi m x_n) = 0$ for any integer $m$.

• For a one-parameter family of elliptic curves, in general we cannot match the elliptic curves in pairs, and thus the set $\{2\theta_t(p)\}$ is not typically symmetric about $\pi$; see Remark 1.2 for some results about biases in the $\theta_t(p)$'s. This leads to some complications in proving equidistribution, as certain sine terms no longer vanish. To overcome this, following other researchers we consider the technically easier situation where for each elliptic curve we include both $\theta_t(p)$ and $2\pi - \theta_t(p)$. To unify the presentation, instead of normalizing these angles by dividing by $2\pi$ (to obtain a distribution supported on $[0, 1]$), we first study the angles modulo $\pi$ and then divide by $\pi$. We thus consider the normalized angles $x_t = \theta_t(p)/\pi$ and $x_{t+V_p} = 1 - \theta_t(p)/\pi$ for $1 \le t \le V_p$. Thus we study $2V_p$ normalized angles in $[0, 1]$, unlike the case of all elliptic curves where we had $V_p$ angles.

• We set $\widetilde{V}_p = V_p$ for the family of all elliptic curves, and $\widetilde{V}_p = 2V_p$ for a one-parameter family of elliptic curves. We study the distribution of the normalized angles $\{x_n\}_{1 \le n \le \widetilde{V}_p}$.



Remark 1.1. To see that we may match the angles as claimed for the family of all elliptic curves, consider the elliptic curve $y^2 = x^3 - ax - b$ with $4a^3 \ne 27b^2$. Let $c$ be any non-residue modulo $p$, and consider the curve $y^2 = x^3 - ac^2x - bc^3$. Using the Legendre sum expressions for $a_E(p)$ and $a_{E'}(p)$ and the automorphism $x \to cx$, we see the second equals $\left(\frac{c}{p}\right)$ times the first; as we have chosen $c$ to be a non-residue, this means $2\sqrt{p}\cos(\theta_{E'}(p)) = -2\sqrt{p}\cos(\theta_E(p))$, or $\theta_{E'}(p) = \pi - \theta_E(p)$ as claimed.
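The pairing in Remark 1.1 is easy to test for a small prime. The following sketch computes $a_E(p)$ through the Legendre-symbol sum and twists by a non-residue $c$; the helper names are illustrative and not taken from the paper.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1


def trace_of_frobenius(a, b, p):
    """a_E(p) for E: y^2 = x^3 - ax - b, as minus a sum of Legendre symbols."""
    return -sum(legendre(x ** 3 - a * x - b, p) for x in range(p))


def quadratic_twist(a, b, c, p):
    """Coefficients (a c^2, b c^3) of the curve y^2 = x^3 - a c^2 x - b c^3."""
    return (a * c * c) % p, (b * c ** 3) % p
```

For a non-residue `c`, `trace_of_frobenius(*quadratic_twist(a, b, c, p), p)` should be the negative of `trace_of_frobenius(a, b, p)`, which is the statement $\theta_{E'}(p) = \pi - \theta_E(p)$.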

Remark 1.2. If the one-parameter family of elliptic curves has rank $r$ over $\mathbb{Q}(T)$ and satisfies Tate's conjecture (see [Ta, RS]), then Rosen and Silverman [RS] prove a conjecture of Nagao [Na] which states

\[ \lim_{X \to \infty} \frac{1}{X} \sum_{p \le X} -A_1(p)\, \frac{\log p}{p} = r, \tag{1.8} \]

where $A_1(p) := \sum_{t \bmod p} a_t(p)$. Tate's conjecture is known for rational surfaces.^4 This bias has been used by S. Arms, A. Lozano-Robledo and S. J. Miller [AL-RM] to construct one-parameter families with moderate rank by finding families where $A_1(p)$ is essentially $-rp$. As there are about $p$ curves modulo $p$, this represents a bias of about $-r$ on average per curve; as each $a_t(p)$ is of order $\sqrt{p}$, we see in the limit that this bias should be quite small per curve (though significant enough to lead to rank, it gives a lower order contribution to the distribution for each prime, and will be dwarfed by our other errors).

^4 An elliptic surface $y^2 = x^3 + A(T)x + B(T)$ is rational if and only if one of the following is true: (1) $0 < \max\{3\deg A, 2\deg B\} < 12$; (2) $3\deg A = 2\deg B = 12$ and $\mathrm{ord}_{T=0}\, T^{12}\Delta(T^{-1}) = 0$.

Our goal is to prove effective theorems on the rate of convergence as $p \to \infty$ to the Sato-Tate measure, which requires us to obtain effective estimates for

\[ \left| \#\{n \le V_p : \theta_n \in I\} - \mu_{ST}(I)\, V_p \right|. \tag{1.9} \]

Here $\mu_{ST}$ is the Sato-Tate measure on $[0, \pi]$ given by

\[ \mu_{ST}(I) = \int_I \frac{2}{\pi} \sin^2 t\, dt, \qquad I \subset [0, \pi], \tag{1.10} \]

and for $n \le V_p$, $2\sqrt{p}\cos(\theta_n)$ equals $p$ minus the number of solutions modulo $p$ of the elliptic curve $E_n : y^2 = x^3 + a_n x + b_n$. Equivalently, using the normalization $x_n = \theta_n/\pi$ to obtain a distribution on $[0, 1]$, the Sato-Tate measure becomes

\[ \mu_{ST}(I) = \int_I 2\sin^2(\pi x)\, dx, \qquad I \subset [0, 1]. \tag{1.11} \]

For a sequence of numbers $x_n$ modulo 1, a measure $\mu$ and an interval $I \subset [0, 1]$, let

\[ N_I(\widetilde{V}_p) = \#\{n \le \widetilde{V}_p : x_n \in I\}, \qquad \mu(I) = \int_I \mu(t)\, dt. \tag{1.12} \]

The discrepancy $D_{I,\widetilde{V}_p}(\mu)$ is

\[ D_{I,\widetilde{V}_p}(\mu) = \left| N_I(\widetilde{V}_p) - \widetilde{V}_p\, \mu(I) \right|; \tag{1.13} \]

with this normalization, the goal is to obtain the best possible estimate for how rapidly $D_{I,\widetilde{V}_p}(\mu)/\widetilde{V}_p$ tends to 0.
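For a concrete list of normalized angles, the discrepancy (1.13) against the normalized Sato-Tate measure (1.11) can be evaluated directly, since the density $2\sin^2(\pi x) = 1 - \cos(2\pi x)$ has an elementary antiderivative. The sketch below assumes closed intervals $I = [a, b]$; the function names are illustrative.

```python
import math


def sato_tate_mass(a, b):
    """mu_ST([a, b]) for the normalized measure 2 sin^2(pi x) dx on [0, 1]."""
    def primitive(x):
        # Antiderivative of 1 - cos(2 pi x).
        return x - math.sin(2.0 * math.pi * x) / (2.0 * math.pi)
    return primitive(b) - primitive(a)


def discrepancy(xs, a, b):
    """|N_I - V * mu_ST(I)| for the sample xs and the interval I = [a, b]."""
    count = sum(1 for x in xs if a <= x <= b)
    return abs(count - len(xs) * sato_tate_mass(a, b))
```

Dividing `discrepancy(xs, a, b)` by `len(xs)` gives the normalized quantity whose rate of decay the theorems below control.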

Previous work has obtained a power savings in convergence to Sato-Tate for two-parameter families of elliptic curves (such as the entire family of all elliptic curves, or parametrizations such as $y^2 = x^3 + f(a)x + g(b)$ with $a$ and $b$ varying in appropriate ranges); see the papers by Banks and Shparlinski [BS, Sh1, Sh2] for savings in Sato-Tate convergence. The key step in these arguments is

1 E BinP + ljMri) k = 

4a3+27i) 2 ^0 mod p 

see Theorem 13.5.3 from [Ka] for a proof. One can obtain new and similar results for one-parameter families of elliptic curves by appealing to a result of Michel [Mic], which we do in §4. Our main results are the following.

Theorem 1.3 (Family of all curves). For the family of all elliptic curves modulo $p$, as $p \to \infty$ we have

\[ D_{I,\widetilde{V}_p}(\mu_{ST}) \le C\, \frac{\widetilde{V}_p}{\log \widetilde{V}_p} \tag{1.15} \]

for some computable $C$. Note that in this family, $\widetilde{V}_p = V_p$ and for each curve we include one normalized angle, $x_n = \theta_n/\pi \in [0, 1]$.

Theorem 1.4 (One-parameter family of elliptic curves). For a one-parameter family of elliptic curves over $\mathbb{Q}(T)$ with non-constant $j$-invariant, we have

\[ D_{I,\widetilde{V}_p}(\mu_{ST}) \le C\, \widetilde{V}_p^{3/4} \tag{1.16} \]

for some computable $C$. Note that in this family, $\widetilde{V}_p = 2V_p$ and for each curve we include two normalized angles, $x_n = \theta_n/\pi$ and $x_{n+V_p} = 1 - \theta_n/\pi$, with $\theta_n \in [0, \pi]$.



EFFECTIVE EQUIDISTRIBUTION AND SATO-TATE FOR ELLIPTIC CURVES 






Stronger results than Theorem 1.3 are known; as remarked above, convergence to Sato-Tate with a power-saving error, instead of $\widetilde{V}_p/\log \widetilde{V}_p$, is obtained in [BS, Sh1, Sh2]. We present these weaker arguments to highlight how one may attack these problems possessing only knowledge of the moments, and not the functions of the angles, in the hope that these arguments might be of use to other researchers attacking similar questions where we only have formulas for the moments of the coefficients. We will thus illustrate the effectiveness (in both senses of the word) of the techniques in [MS], as well as illustrate the loss of information that comes from having to trivially bound certain combinatorial sums. As we have not found similar effective results in the literature for one-parameter families, in order to get the best possible results we do not use formulas for the moments but rather estimates for the analogue of (1.14). It is worth remarking that we can recover the results of [BS, Sh1, Sh2] by our generalization of [MS] provided we also use (1.14) (see [Ka]) instead of results from Birch [Bi] on moments; this shows the value of the formulation in [MS].

We summarize the key ingredients of the proofs, and discuss why the second result has a much better error term than the first. Similar to [MS], both theorems follow from an analysis of $\sum_{n \le \widetilde{V}_p} e(m x_n)$ (we use $x_n = \theta_n/\pi$ in order to have a distribution supported on $[0, 1]$). For the family of all elliptic curves, after some algebra we see this is equivalent to understanding $\sum_{n \le V_p} \cos(2m\theta_n)$; using a combinatorial identity (see [Mil4]) this is equivalent to a linear combination of sums of the form $\sum_{n \le V_p} (\cos\theta_n)^{2r}$. These sums are essentially the $2r$-th moments of the Fourier coefficients of the family of all elliptic curves modulo $p$. Birch [Bi] evaluated these, and showed the answers are the Catalan numbers^5 plus lower order terms. Our equidistribution result then follows from a combinatorial identity of a sum of weighted Catalan numbers; our error term is poor because we lose cancellation when bounding the contribution from the sums of the error terms.

^5 The Catalan numbers are the moments of the semi-circle distribution, which is related to the Sato-Tate distribution through a simple change of variables.

The proof of Theorem 1.4 is easier, as now instead of inputting results on the moments we use a result of Michel [Mic] for the sum over the family of $\mathrm{sym}_k(\theta_n) = \sin((k+1)\theta_n)/\sin\theta_n$. This is easily related to our quantity of interest, $\cos(2m\theta_n)$, through identities of Chebyshev polynomials:

\[ \cos(2m\theta_n) = \tfrac{1}{2}\,\mathrm{sym}_{2m}(\theta_n) - \tfrac{1}{2}\,\mathrm{sym}_{2m-2}(\theta_n). \tag{1.17} \]

The advantage of having a formula for the quantity we want and not a related quantity is that we avoid trivially estimating the errors in the combinatorial sums. These calculations increased the size of the error significantly, and this is why Theorem 1.4 is stronger than Theorem 1.3, though the error term in Theorem 1.3 is comparable to the error terms of the equivalent quantities in [MS] for the family of cuspidal newforms. Michel proves his result by using a cohomological interpretation, and this results in the error term being $p^{1/2}$ smaller than the main term; it is this savings in the quantity we are directly interested in that leads to the superior error estimates.

The paper is organized as follows. After reviewing the needed results from Murty-Sinha [MS] in §2, we prove Theorem 1.3 in §3 and Theorem 1.4 in §4. For completeness the needed combinatorial identities are proved in Appendix A, and in Appendix B we correct some errors in explicit formulas for moments in Birch's paper [Bi] (where he neglected to mention that his sums are normalized by dividing by $p - 1$).



2. Effective Equidistribution Preliminaries 



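We quickly review some needed results from Murty-Sinha [MS]; while our setting is similar to the problems they investigated, there are slight differences which require generalizations of some of their results. Assume $\mu = F(-x)\,dx$ with

\[ F(x) = \sum_{m=-\infty}^{\infty} c_m e(mx), \tag{2.1} \]

where $e(z) = \exp(2\pi i z)$. Theorem 8 from [MS] is

Theorem 2.1. Let $\{x_n\}$ be a sequence of real numbers in $[0, 1]$ and let the notation be as above. Assume for each $m$ that

\[ \lim_{V_p \to \infty} \frac{1}{V_p} \sum_{n \le V_p} e(m x_n) = c_m \quad \text{and} \quad \sum_{m=-\infty}^{\infty} |c_m| < \infty. \tag{2.2} \]

Let $\|\mu\| = \sup_{x \in [0,1]} |F(x)|$ with $\mu = F(-x)\,dx$. Then the discrepancy satisfies

\[ D_{I,V_p}(\mu) \le \frac{V_p \|\mu\|}{M+1} + \sum_{1 \le |m| \le M} \left( \frac{1}{M+1} + \min\left(b - a, \frac{1}{\pi |m|}\right) \right) \left| \sum_{n=1}^{V_p} e(m x_n) - V_p c_m \right| \tag{2.3} \]

for any natural numbers $V_p$ and $M$ (here $I = [a, b]$).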

Unfortunately, Theorem 2.1 is not directly applicable in our case. The reason is that there we have a limit as $V_p \to \infty$ in the definition of the $c_m$, whereas for us we fix a prime $p$ and have $V_p = p(p-1)$ for the family of all elliptic curves modulo $p$, or $p + O(1)$ for a one-parameter family. Analyzing the proof of Theorem 8 from [MS], however, we see that the claim holds for any sequence $c_m$ (obviously if $V_p^{-1} \sum_{n \le V_p} e(m x_n)$ is not close to $c_m$ then the discrepancy is large). We thus obtain

Theorem 2.2. Let $\{x_n\}$ be a sequence of real numbers in $[0, 1]$ and let the notation be as above. Let $\{c_m\}$ be a sequence of numbers such that $\sum_{m=-\infty}^{\infty} |c_m| < \infty$ (we will take $c_0 = 1$, $c_{\pm 1} = -1/2$ and all other $c_m$'s equal to zero). Let $\|\mu\| = \sup_{x \in [0,1]} |F(x)|$ with $\mu = F(-x)\,dx$. Then the discrepancy satisfies

\[ D_{I,V_p}(\mu) \le \frac{V_p \|\mu\|}{M+1} + \sum_{1 \le |m| \le M} \left( \frac{1}{M+1} + \min\left(b - a, \frac{1}{\pi |m|}\right) \right) \left| \sum_{n=1}^{V_p} e(m x_n) - V_p c_m \right| \tag{2.4} \]

for any natural numbers $V_p$ and $M$.
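To make the shape of the bound concrete, the following sketch evaluates the right-hand side of (2.4) for a finite sample, reading the interval as $I = [a, b]$. The function names, the dictionary encoding of the $c_m$, and passing $\|\mu\|$ as a parameter are assumptions made for illustration only.

```python
import cmath
import math


def exp_sum(xs, m):
    """Exponential sum over the sample: sum of e(m x_n) with e(z) = exp(2 pi i z)."""
    return sum(cmath.exp(2j * math.pi * m * x) for x in xs)


def discrepancy_bound(xs, a, b, M, coeffs, sup_norm):
    """Right-hand side of (2.4) for I = [a, b].

    coeffs maps m to c_m (missing keys are treated as 0) and
    sup_norm plays the role of ||mu||.
    """
    V = len(xs)
    total = V * sup_norm / (M + 1)
    for m in range(-M, M + 1):
        if m == 0:
            continue  # the sum in (2.4) runs over 1 <= |m| <= M
        weight = 1.0 / (M + 1) + min(b - a, 1.0 / (math.pi * abs(m)))
        total += weight * abs(exp_sum(xs, m) - V * coeffs.get(m, 0.0))
    return total
```

For the Sato-Tate case one would call it with `coeffs = {0: 1.0, 1: -0.5, -1: -0.5}` and `sup_norm = 2.0`, since $F(x) = 1 - \cos(2\pi x)$ is at most 2 in absolute value.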

To simplify applying the results from [MS], we study the normalized angles $x_n$. Under our normalization, the Sato-Tate measure becomes

\[ \mu_{ST}(I) = \int_I 2\sin^2(\pi x)\, dx, \qquad I \subset [0, 1]. \tag{2.5} \]

The Fourier coefficients of $\mu_{ST}$ are readily calculated.

Lemma 2.3. Let $\mu_{ST} = F(-x)\,dx$ be the normalized Sato-Tate distribution on $[0, 1]$ with density $2\sin^2(\pi x)$. We have

\[ F(x) = 1 - \tfrac{1}{2}\left(e(x) + e(-x)\right), \tag{2.6} \]

which implies that the Fourier coefficients are $c_0 = 1$, $c_{\pm 1} = -1/2$ and $c_m = 0$ for $|m| \ge 2$.

Proof. The proof is immediate from the expansion of $F$ as a sum of exponentials, which follows from the identities $\cos(2\theta) = 1 - 2\sin^2(\theta)$ and $e(\theta) = \cos(2\pi\theta) + i\sin(2\pi\theta)$. □
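Lemma 2.3 can also be checked numerically by integrating the density against $e(-mx)$. The sketch below uses a midpoint rule, which is an illustrative choice; the function name is hypothetical.

```python
import cmath
import math


def fourier_coefficient(m, steps=4096):
    """Approximate c_m = integral over [0, 1] of 2 sin^2(pi x) e(-m x) dx
    with the composite midpoint rule."""
    h = 1.0 / steps
    total = 0j
    for k in range(steps):
        x = (k + 0.5) * h
        total += 2.0 * math.sin(math.pi * x) ** 2 * cmath.exp(-2j * math.pi * m * x)
    return total * h
```

The values returned for $m = 0$, $m = \pm 1$ and $|m| \ge 2$ should be close to $1$, $-1/2$ and $0$ respectively.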

3. Proof of Effective Equidistribution for All Curves 



We use Birch's [Bi] results on the moments of the family of all elliptic curves modulo $p$ (there are some typos in his explicit formulas; we correct these in Appendix B); unfortunately, these are results for quantities such as $(2\sqrt{p}\cos\theta_n)^{2R}$, and the quantity which naturally arises in our investigation is $e(m x_n)$ (with $x_n$ running over the normalized angles $\theta_{a,b}(p)/\pi$), specifically

\[ \left| \sum_{n=1}^{V_p} e(m x_n) - V_p c_m \right|. \tag{3.1} \]

By applying some combinatorial identities we are able to rewrite our sum in terms of the moments, which allows us to use Birch's results. The point of this section is not to obtain the best possible error term (which following [BS, Sh1, Sh2] could be obtained by replacing Birch's bounds with (1.14)) but rather to highlight how one may generalize and apply the framework from [MS].

We first set some notation. Let $\sigma_k(T_p)$ denote the trace of the Hecke operator $T_p$ acting on the space of cusp forms of dimension $-2k$ on the full modular group. We have $\sigma_{k+1}(T_p) = O(p^{k+c+\epsilon})$, where from [Sel] we see we may take $c = 3/4$ (there is no need to use the optimal $c$, as our final result, namely (3.17), will yield the same order of magnitude result for $c = 3/4$ or $c = 0$). Let $M_p(2R)$ denote the $2R$-th moment of $2\cos(\theta_n) = 2\cos(\pi x_n)$ (as we are concerned with the normalized values, we use slightly different notation than in [Bi]):

\[ M_p(2R) = \frac{1}{V_p} \sum_{n=1}^{V_p} (2\cos(\pi x_n))^{2R}. \tag{3.2} \]

Lemma 3.1 (Birch). Notation as above, we have

\[ M_p(2R) = \frac{1}{R+1}\binom{2R}{R} + O\left(2^{2R}\, V_p^{-\frac{1-c-\epsilon}{2}}\right); \tag{3.3} \]

we may take $c = 3/4$ and thus there is a power saving.^6

Proof. The result follows from dividing the equation for $S^*_R(p)$ on the bottom of page 59 of [Bi] by $p^R$, as we are looking at the moments of the normalized Fourier coefficients of the elliptic curves, and then using the bound $\sigma_{k+1}(T_p) = O(p^{k+c+\epsilon})$, with $c = 3/4$ admissible by [Sel]. Recall $V_p = p(p-1)$ is the cardinality of the family. We have

'2R\ p(p - 1) 



M P {2R) 




R V 

R 



2k +1 ( 2R \ p l+c+e ^ p 



^ R+k + l\R + kJ V p p R V % 



+ 0(2 2R V P 2 ) (3.4) 



k=l v / "p 

2R 

since V p = p(p — 1). □ 

^6 Note $\frac{1}{R+1}\binom{2R}{R}$ is the $R$-th Catalan number. The Catalan numbers are the moments of the semi-circle distribution, which is related to the Sato-Tate distribution by a simple change of variables.

A simple argument (see Remark 1.1) shows that the normalized angles are symmetric about 1/2. This implies

\[ \sum_{n=1}^{V_p} e(m x_n) = \sum_{n=1}^{V_p} \cos(2\pi m x_n) + i \sum_{n=1}^{V_p} \sin(2\pi m x_n) = \sum_{n=1}^{V_p} \cos(2m\theta_n), \tag{3.5} \]

where the sine piece does not contribute as the angles are symmetric about 1/2, and we are denoting the $V_p$ non-normalized angles by $\theta_n$. Thus it suffices to show we have a power saving in

\[ \sum_{n=1}^{V_p} \cos(2m\theta_n). \tag{3.6} \]

By symmetry, it suffices to consider $m \ge 0$.

Lemma 3.2. Let $c_0 = 1$, $c_{\pm 1} = -1/2$ and $c_m = 0$ otherwise. There is some $c < 1$ such that

\[ \left| \sum_{n=1}^{V_p} \cos(2m\theta_n) - V_p c_m \right| \ll m^2\, 2^{3m}\, V_p^{\frac{1+c+\epsilon}{2}}. \tag{3.7} \]

By work of Selberg [Sel] we may take $c = 3/4$.

Proof. The case $m = 0$ is trivial. For $m = 1$ we use the trigonometric identity $\cos(2\theta_n) = 2\cos^2(\theta_n) - 1$. As $c_{\pm 1} = -1/2$ we have


£coB(20„)-f = ^ 



n=l 



(2 COS 2 n - 1) + 



COS0 r 



1) 



n=l 



It, 



n=l 



P 



- 1 



(3.8) 



Note the sum of $(2\sqrt{p}\cos\theta_n)^2$ is the second moment of the number of solutions modulo $p$. From [Bi] we have that this is $p + O(1)$; the explicit formula given in [Bi] for the second moment is wrong; see Appendix B for the correct statement. Substituting yields



Vn 



^cos(20 n )-f 



n=l 



« 0(1). 



(3.9) 



The proof is completed by showing that $\sum_{n=1}^{V_p} \cos(2m\theta_n) = O_m\left(V_p^{(1+c+\epsilon)/2}\right)$ provided $2 \le m \le M$. In order to obtain the best possible results, it is important to understand the implied constants, as $M$ will have to grow with $V_p$ (which is of size $p^2$). While it is possible to analyze this sum for any $m$ by brute force, we must have $M$ growing with $p$, and thus we need an argument that works in general. As $c_{\pm 1} \ne 0$ but $c_m = 0$ for $|m| \ge 2$, we expect (and we will see) that the argument below does break down when $|m| = 1$.

There are many possible combinatorial identities we can use to express $\cos(2m\theta_n)$ in terms of powers of $\cos(\theta_n)$. We use the following (for a proof, see Definition 2 and equation (3.1) of [Mil4]):

\[ 2\cos(2m\theta_n) = \sum_{r=0}^{m} c_{2m,2r}\, (2\cos\theta_n)^{2r}, \tag{3.10} \]

where $c_{2r} = (2r)!/2$, $c_{0,0} = 0$, $c_{2m,0} = (-1)^m 2$ for $m \ge 1$, and for $1 \le r \le m$ set

\[ c_{2m,2r} = \frac{(-1)^{r+m}}{c_{2r}} \prod_{j=0}^{r-1} (m^2 - j^2) = \frac{(-1)^{m+r}}{c_{2r}} \cdot \frac{m \cdot (m+r-1)!}{(m-r)!}. \tag{3.11} \]

We now sum (13.101) over n and divide by V p , the cardinality of the family. In the 
argument below, at one point we replace 2 2r in an error term with 2012^- ( 2r ) • m 2 ; this 

allows us to pull the r th Catalan number, ^j-j- ( 2r ) , out of the error term[j Using Lemma 
I3.1l we find 

\2r 



— 2 cos(2mfl ra ) = c 2m ,2r — ^ (2 cos 6 Ti 



n=l r=0 n I 



C 2m ,2r 



E 



. r + 1 \ r 

— / 1 (2r)\ (-l) m+r 2 m-(m + r)! 
r+1 r!r! (2r)! (m — r)!-(m + r) 



r=0 

'j , r\ ( ^ 



r=0 v 

2 



r=0 v 
>-,2 





2 




l)r 








l-c-e 


V p 


2 








l-c-e 


v p 


2 



ml (m + r)\ 



(m — r)\ m\r\ (r + l)(yn + r) 



m\ ( m + r 



r / (r + l)(m + r) 



(3.12) 

We first bound the error term. For our range of $r$, $\binom{m+r}{r} \le \binom{2m}{m} \le 2^{2m}$. The sum of $\binom{m}{r}$ over $r$ is $2^m$, and we get to divide by at least $m + r \ge m$. Thus the error term is
The reason this is valid is that the largest binomial coefficient is the middle (or the middle two when 
the upper argument is odd). Thus 2 2r = (1 + l) 2r < (2r + 1) ( 2 r r ) < 2(m + 1) ( 2 , r ) (as m < r), and the 
claim follows from 20 r 12 ™ 2 > 2(m + 1) for m > 2 and < r < m. 
bounded by

\[ O\left(m^2\, 2^{3m}\, V_p^{-\frac{1-c-\epsilon}{2}}\right). \tag{3.13} \]

We now turn to the main term. It is just $(-1)^m 2m$ times the sum in Lemma A.3, which is shown in that lemma to equal 0 for any $|m| \ge 2$. □
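The vanishing of the main term can be checked with exact arithmetic: weighting the $r$-th Catalan number by the coefficient $c_{2m,2r} = (-1)^{m+r}\, 2m\,(m+r-1)!/((2r)!\,(m-r)!)$ of the expansion of $2\cos(2m\theta)$ in powers of $2\cos\theta$ and summing over $r$ should give $2c_m$, that is $-1$ for $m = 1$ and $0$ for $m \ge 2$. The sketch below is a brute-force check for small $m$, not a substitute for the proofs in the appendix; the helper names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def catalan(r):
    """r-th Catalan number, binom(2r, r) / (r + 1)."""
    return Fraction(comb(2 * r, r), r + 1)


def coeff(m, r):
    """Coefficient of (2 cos theta)^{2r} in 2 cos(2 m theta), for m >= 1."""
    if r == 0:
        return Fraction((-1) ** m * 2)
    numerator = (-1) ** (m + r) * 2 * m * factorial(m + r - 1)
    return Fraction(numerator, factorial(2 * r) * factorial(m - r))


def main_term(m):
    """Sum over r of the coefficient times the r-th Catalan number."""
    return sum(coeff(m, r) * catalan(r) for r in range(m + 1))
```

That `main_term(1)` is nonzero is the reason the case $|m| = 1$ has to be handled separately.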

Remark 3.3. Without Lemma A.3 our combinatorial expansion would be useless. We thus give several proofs in the appendix (including a brute force proof, a hypergeometric proof and an application of Zeilberger's Fast Algorithm).

Remark 3.4. It is possible to get a better estimate for the error term by a more detailed analysis of $\sum_{r \le m} \binom{m}{r}\binom{m+r}{r}$; however, the improved estimates only change the constants in the discrepancy estimates, and not the savings. This is because this sum is at least as large as the term when $r \approx m/2$, and this term contributes something of the order $3^{3m/2}/m$ by Stirling's formula. We will see that any error term of size $3^{am}$ for a fixed $a$ gives roughly the same value for the best cutoff choice for $M$, differing only by constants. Thus we do not bother giving a more detailed analysis to optimize the error here.

We now prove the first of our two main theorems.

Proof of Theorem 1.3. We must determine the optimal $M$ to use in (2.4):

\[ D_{I,V_p}(\mu_{ST}) \ll \frac{V_p}{M+1} + \sum_{1 \le m \le M} \left( \frac{1}{M+1} + \frac{1}{m} \right) m^2\, 2^{3m}\, V_p^{\frac{1+c+\epsilon}{2}} \ll \frac{V_p}{M} + M\, 2^{3M}\, V_p^{\frac{1+c+\epsilon}{2}}, \tag{3.14} \]

as $\frac{1}{M+1} \le \frac{1}{m}$ and $\sum_{m \le M} 2^{3m} \ll 2^{3M}$. For all $c > 0$ we find the minimum error by setting the two terms equal to each other, which yields

\[ V_p^{\frac{1-c-\epsilon}{2}} = M^2\, 2^{3M} \le e^{3M}, \tag{3.15} \]

which when equating yields^8

\[ e^{3M} \approx e^{\frac{1-c-\epsilon}{2} \log V_p}, \tag{3.16} \]

which implies

\[ M \approx \frac{1-c-\epsilon}{6} \log V_p. \tag{3.17} \]

We thus see that we may find a constant $C$ such that

\[ D_{I,V_p}(\mu_{ST}) \le C\, \frac{V_p}{\log V_p}. \tag{3.18} \]

□

^8 We could obtain a slightly better constant below with a little more work; however, as it will not affect the quality of our result we prefer to give the simpler argument with a slightly worse constant.



4. Proof of Effective Equidistribution for One-parameter families 



Instead of studying the family of all elliptic curves, we can also investigate one-parameter families over $\mathbb{Q}(T)$. Thus, consider the family $\mathcal{E} : y^2 = x^3 + A(T)x + B(T)$, where $A(T)$ and $B(T)$ are in $\mathbb{Z}[T]$. We assume that $j(T)$ is not constant for the family. Michel [Mic] proved a Sato-Tate law for such families. In particular, he proved



Theorem 4.1 (Michel [Mic]). Consider a one-parameter family of elliptic curves over $\mathbb{Q}(T)$ with non-constant $j$-invariant. Let $c_\Delta$ denote the number of complex zeros of $\Delta(z) = 0$ (where $\Delta$ is the discriminant), $\psi_p$ an additive character (and set $\delta_{\psi_p} = 0$ if this character is trivial and 1 otherwise), and write $a_{t,p}$ as $2\sqrt{p}\cos\theta_{t,p}$ with $\theta_{t,p} \in [0, \pi]$. Let

\[ \mathrm{sym}_k(\theta) = \frac{\sin((k+1)\theta)}{\sin\theta}. \tag{4.1} \]

Then

\[ \left| \sum_{\substack{t \bmod p \\ \Delta(t) \ne 0}} \mathrm{sym}_k(\theta_{t,p})\, \psi_p(t) \right| \le (k+1)(c_\Delta - \delta_{\psi_p} - 1)\sqrt{p}. \tag{4.2} \]

Additionally, we have

\[ \left| \sum_{\substack{t \bmod p \\ \Delta(t) \ne 0}} \cos\theta_{t,p} \right| \le C\sqrt{p} \tag{4.3} \]

for some $C$ depending on the family. Finally, we may drop the additive character and drop the restriction that $\Delta(t) \ne 0$ at the cost of a bounded number of summands, each of which is at most $(k+1)$,^9 which implies these relations still hold provided we multiply the bounds on the right hand side by some constant $C$.



Remark 4.2. Miller [Mil2] showed that the error term in Theorem 4.1 is sharp. Specifically, the second moment of the family $y^2 = x^3 + Tx^2 + 1$ of elliptic curves over $\mathbb{Q}(T)$ for $p > 2$ is

A 2 (p) := J2 a t(P) 2 = P 2 ~ ^3,2, P P ~ I + P Yl +1 \ ( 4 - 4 ) 

x mod p ^ 



t mod p 



V 



where $n_{3,2,p}$ denotes the number of cube roots of 2 modulo $p$. For any $[a, b] \subset [-2, 2]$ there are infinitely many primes $p \equiv 1 \bmod 3$ such that

\[ A_2(p) - (p^2 - n_{3,2,p}\, p - 1) \in [a \cdot p^{3/2},\ b \cdot p^{3/2}]. \tag{4.5} \]

^9 This is readily seen by writing $\sin((k+1)\theta) = \sin(\theta)\cos(k\theta) + \cos(\theta)\sin(k\theta)$ and proceeding by induction.

Theorem 4.1 is used by Michel to obtain good estimates for the average rank in these families, as well as (of course) proving Sato-Tate laws. Using our techniques above, we can convert Michel's bounds to a quantified equidistribution law.

We recall the notation for Theorem 1.4. Consider a one-parameter family of elliptic curves over $\mathbb{Q}(T)$ with non-constant $j(T)$. Let there be $V_p = p + O(1)$ reduced curves modulo $p$, and set $\widetilde{V}_p = 2V_p$. For each curve $E_t$ consider the angles $\theta_{t,p}$ and $\pi - \theta_{t,p}$, with $\theta_{t,p} \in [0, \pi]$, and the normalized angles $x_n = \theta_{t,p}/\pi$ and $x_{n+V_p} = 1 - \theta_{t,p}/\pi$ (for $1 \le n \le V_p$).

Proof of Theorem 1.4. We must show $D_{I,\widetilde{V}_p}(\mu_{ST}) \ll \widetilde{V}_p^{3/4}$ (where $\widetilde{V}_p \approx 2p$). As in the proof of Theorem 1.3 it suffices to bound

\[ \left| \sum_{t \bmod p} \cos(2m\theta_{t,p}) - c_m\, p \right| \tag{4.6} \]

with $c_0 = 1$, $c_{\pm 1} = -1/2$ and all other $c_m = 0$. This is because we have enlarged our set of normalized angles to be symmetric about 1/2. Thus when we study $e(m x_n) = \cos(2\pi m x_n) + i\sin(2\pi m x_n)$, the sine sum vanishes. We are therefore left with the cosine sum, with the normalized angles $x_n$ and $x_{n+V_p}$ contributing equally. Thus we may replace the sum of the cosine piece over $n$ with a sum over the angles $\theta_{t,p}$, so long as we remember to multiply by 2 when computing the discrepancy later. While we should subtract $c_m V_p$ and not $c_m p$, as $V_p = p + O(1)$ the error in doing this is dwarfed by the error of the piece we are studying.

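The case of $2m = 0$ is trivial. If $2m = 2$, then we are studying $\cos 2\theta_{t,p} = -\tfrac{1}{2} + \tfrac{1}{2}\mathrm{sym}_2(\theta_{t,p})$. By Theorem 4.1 we thus find that

\[ \left| \sum_{t \bmod p} \cos(2\theta_{t,p}) + \frac{p}{2} \right| = \left| \frac{1}{2} \sum_{t \bmod p} \mathrm{sym}_2(\theta_{t,p}) \right| \le C\sqrt{p}. \tag{4.7} \]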
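The rewriting of $\cos 2\theta$ used here is the $m = 1$ instance of the identity (1.17), $\cos(2m\theta) = \tfrac12\mathrm{sym}_{2m}(\theta) - \tfrac12\mathrm{sym}_{2m-2}(\theta)$. A quick numerical check, with illustrative function names and angles chosen away from the zeros of $\sin\theta$:

```python
import math


def sym(k, theta):
    """sym_k(theta) = sin((k + 1) theta) / sin(theta), for sin(theta) != 0."""
    return math.sin((k + 1) * theta) / math.sin(theta)


def cos_via_sym(m, theta):
    """cos(2 m theta) written as (sym_{2m}(theta) - sym_{2m-2}(theta)) / 2."""
    return 0.5 * sym(2 * m, theta) - 0.5 * sym(2 * m - 2, theta)
```

Since $|\mathrm{sym}_k(\theta)| \le k + 1$, each term on the right is bounded by a constant multiple of $m$, which is the source of the factor $m$ in the bounds for larger $m$.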



For higher m, we use Chebyshev polynomials (see HWill ). The Chebyshev polynomials 
of the first kind are given by T^(cos^) = cos(i9); the Chebyshev polynomials of the 
second kind are Ue(cos9) = sym m (#). These polynomials are related by 



T/fcos ( 



UAcos9) — £/£_2(cos 



syr%(#) - sym^_ 2 (^ 



2 2 
we use this with I = 2m > 4. Using Theorem |4~T1 we see that for m > 2, 



(4.8) 



cos(2m9 tiP ) 

t mod p 



< Cm^/p. 



(4.9) 
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From (12.41) . the discrepancy satisfies 



+ E 

Km<M 



< 



M + 



M + l 



+ min ( b — a, 



7r m 



t=l 



(4.10) 



Using our bounds, we have 



M + l 



M 

E 

m=l 



m 



(4.11) 



The two error terms are of the same order of magnitude when M 2 = y/p, or M = p 



1/4 



This leads to 



3/4 



(4.12) 



which should be compared to a discrepancy of order p; in other words, we have a 
power savings (much better than the logarithmic savings in the family of all elliptic 
curves). □ 
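To get a feel for the sizes in this proof, one can compute the cosine sums for a concrete family. The sketch below is only an illustration: the family $y^2 = x^3 + x + t$ (which has non-constant $j$-invariant) and all function names are our own choices, not taken from the argument above, and the code makes no claim about the constants in Theorem 4.1.

```python
import math

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, returned as -1, 0 or 1."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def family_angles(p):
    """Angles theta_{t,p} in [0, pi] for the illustrative family y^2 = x^3 + x + t."""
    angles = []
    for t in range(p):
        if (4 + 27 * t * t) % p == 0:
            continue  # skip singular fibres (discriminant zero mod p)
        a = -sum(legendre(x**3 + x + t, p) for x in range(p))
        # Hasse's bound |a| <= 2 sqrt(p) keeps the argument of acos in [-1, 1].
        angles.append(math.acos(a / (2.0 * math.sqrt(p))))
    return angles

def cosine_sums(p, max_m=4):
    """Return {m: sum_t cos(2 m theta_{t,p}) - c_m p}, with c_1 = -1/2 and other c_m = 0."""
    thetas = family_angles(p)
    sums = {}
    for m in range(1, max_m + 1):
        c_m = -0.5 if m == 1 else 0.0
        sums[m] = sum(math.cos(2 * m * th) for th in thetas) - c_m * p
    return sums

if __name__ == "__main__":
    for p in (101, 211, 401):
        print(p, round(math.sqrt(p), 2), {m: round(v, 2) for m, v in cosine_sums(p).items()})
```

Printing the sums next to $\sqrt{p}$ lets one compare them with the bounds (4.7) and (4.9); balancing $p/M$ against $M\sqrt{p}$ at $M = p^{1/4}$ then gives the $p^{3/4}$ of (4.12).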

Remark 4.3. Note we could have used the Chebyshev identities to handle the $m = 1$
case as well, as in fact we implicitly did when we rewrote $\cos 2\theta$; we prefer to break the
analysis into two cases as the $m = 1$ case has $c_m \neq 0$.

Remark 4.4. Rosen and Silverman [RS] proved a conjecture of Nagao [Na] relating
the distribution of the $a_t(p)$'s and the rank. Unfortunately the known lower order term
due to the rank of the family is of size $p^{1/2}$, which is significantly smaller than the error
terms of size $p^{3/4}$ analyzed above. As noted in Remark 4.2 the error term is sharp and
cannot be improved for all families.



Appendix A. Combinatorial Identities 

We first state some needed properties of the binomial coefficients. For $n, k$ non-negative
integers we set $\binom{n}{k} = \frac{n!}{k!(n-k)!}$. We generalize to real $n$ and $k$ a positive integer
by setting

$$\binom{n}{k} = \frac{n(n-1)\cdots(n-(k-1))}{k!}, \tag{A.1}$$

which clearly agrees with our original definition for $n$ a positive integer. Finally, we set
$\binom{n}{0} = 1$ and $\binom{n}{k} = 0$ if $k$ is a negative integer.
To prove our main result we need the following two lemmas; we follow the proofs in 

Lemma A.1 (Vandermonde's Convolution Lemma). Let $r, s$ be any two real numbers
and $k, m, n$ integers. Then

$$\sum_k \binom{r}{m+k} \binom{s}{n-k} = \binom{r+s}{m+n}. \tag{A.2}$$

Proof. It suffices to prove the claim when $r, s$ are integers. The reason is that both sides
are polynomials in $r$ and $s$, and if the polynomials agree for an infinitude of integers then they
must be identical. It suffices to consider the special case $m = 0$ (shift the summation index
by $m$ and relabel $m + n$ as $n$), in which case we are reduced to showing

$$\sum_k \binom{r}{k} \binom{s}{n-k} = \binom{r+s}{n}. \tag{A.3}$$

Consider the polynomial

$$(x+y)^r (x+y)^s = (x+y)^{r+s}. \tag{A.4}$$

If we use the binomial theorem to expand each factor on the left hand side of (A.4), the
coefficient of $x^n y^{r+s-n}$ is the left hand side of (A.3), while the coefficient of
$x^n y^{r+s-n}$ on the right hand side of (A.4) is the right hand side of (A.3), which
completes the proof. □
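Vandermonde's convolution (A.2) can be spot-checked with exact rational arithmetic, including non-integer $r, s$. This is a minimal sketch using the generalized binomial coefficient of (A.1); the helper names `binom` and `vandermonde_holds` are ours.

```python
from fractions import Fraction

def binom(n, k):
    """Generalized binomial coefficient of (A.1): n rational, k an integer."""
    if k < 0:
        return Fraction(0)  # convention: zero for negative lower index
    result = Fraction(1)    # empty product, so binom(n, 0) = 1
    for j in range(k):
        result *= Fraction(n) - j
        result /= j + 1
    return result

def vandermonde_holds(r, s, m, n, k_range=30):
    """Compare both sides of (A.2); only -m <= k <= n contributes, so a finite window suffices."""
    lhs = sum(binom(r, m + k) * binom(s, n - k) for k in range(-k_range, k_range + 1))
    return lhs == binom(r + s, m + n)

if __name__ == "__main__":
    print(vandermonde_holds(Fraction(1, 2), Fraction(-7, 3), 1, 3))
```

The window `k_range` must contain the interval $[-m, n]$, which holds for the small parameters used here.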

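Lemma A.2. Let $\ell, m, s$ be non-negative integers. Then

$$\sum_k (-1)^k \binom{\ell}{m+k} \binom{s+k}{n} = (-1)^{\ell+m} \binom{s-m}{n-\ell}. \tag{A.5}$$

Proof. Using $\binom{a}{b} = \binom{a}{a-b}$, we rewrite $\binom{s+k}{n}$ as $\binom{s+k}{s+k-n}$, and we then rewrite $\binom{s+k}{s+k-n}$ as
$(-1)^{s+k-n} \binom{-n-1}{s+k-n}$ by using the extension of the binomial coefficient, where we have
pulled out all the negative signs in the numerators. The advantage of this simplification
is that the summation index is now only in the denominator; further, the power of $-1$ is
now independent of $k$. Factoring out the sign, our quantity is equivalent to

$$(-1)^{s-n} \sum_k \binom{\ell}{\ell-m-k} \binom{-n-1}{s+k-n}, \tag{A.6}$$

where we again use $\binom{a}{b} = \binom{a}{a-b}$. By Vandermonde's Convolution, this equals
$(-1)^{s-n} \binom{\ell-n-1}{\ell-m-n+s}$. Using $\binom{\ell-n-1}{\ell-m-n+s} = (-1)^{\ell-m-n+s} \binom{s-m}{\ell-m-n+s} = (-1)^{\ell-m-n+s} \binom{s-m}{n-\ell}$
and collecting powers of $-1$ completes the proof (note $(-1)^{\ell-m} = (-1)^{\ell+m}$). □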
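Since the sign bookkeeping in Lemma A.2 is easy to get wrong, a brute-force comparison of the two sides for small parameters is a useful sanity check. The sketch below assumes the identity in the form $\sum_k (-1)^k \binom{\ell}{m+k}\binom{s+k}{n} = (-1)^{\ell+m}\binom{s-m}{n-\ell}$ and uses the extended binomial coefficient with integer (possibly negative) upper entry; the function names are ours.

```python
import math

def binom(n, k):
    """Extended binomial coefficient for an integer n (possibly negative) and integer k."""
    if k < 0:
        return 0
    numerator = 1
    for j in range(k):
        numerator *= n - j
    # A product of k consecutive integers is divisible by k!, so this division is exact.
    return numerator // math.factorial(k)

def lemma_a2_sides(ell, m, s, n):
    """Return (lhs, rhs) of the identity for non-negative integers ell, m, s and integer n."""
    lhs = 0
    # binom(ell, m + k) vanishes unless 0 <= m + k <= ell.
    for k in range(-m, ell - m + 1):
        sign = -1 if k % 2 else 1
        lhs += sign * binom(ell, m + k) * binom(s + k, n)
    rhs = (-1 if (ell + m) % 2 else 1) * binom(s - m, n - ell)
    return lhs, rhs

if __name__ == "__main__":
    print(lemma_a2_sides(2, 1, 3, 3))
```

The specialization $\ell = M+1$, $m = 0$, $s = M-2$, $n = M-1$ (with $M \ge 2$) is the one needed in the proof of Lemma A.3.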

Lemma A.3. Let $m$ be an integer greater than or equal to 1. Then

$$\sum_{r=0}^{m} (-1)^r \binom{m}{r} \binom{m+r}{r} \frac{1}{(r+1)(m+r)} = \begin{cases} 1/2 & \text{if } m = 1 \\ 0 & \text{if } m \ge 2. \end{cases} \tag{A.7}$$
Proof. The case $m = 1$ follows by direct evaluation. Consider now $m \ge 2$. We have

$$\begin{aligned}
S_m &= \sum_{r=0}^{m} (-1)^r \binom{m}{r} \binom{m+r}{r} \frac{1}{(r+1)(m+r)} \\
&= \sum_{r=0}^{m} (-1)^r \binom{m}{r} \frac{m+1}{m+1} \binom{m+r}{r} \frac{1}{(r+1)(m+r)} \\
&= \sum_{r=0}^{m} (-1)^r \frac{m!\,(m+1)}{(r+1) \cdot r!\,(m-r)!} \cdot \frac{1}{m+1} \cdot \frac{(m+r)(m+r-1)!}{r!\, m \cdot (m-1)!} \cdot \frac{1}{m+r} \\
&= \sum_{r=0}^{m} (-1)^r \binom{m+1}{r+1} \binom{m-1+r}{r} \frac{1}{m(m+1)} \\
&= \frac{1}{m(m+1)} \sum_{r=0}^{m} (-1)^r \binom{m+1}{r+1} \binom{m-1+r}{r}.
\end{aligned} \tag{A.8}$$

We change variables and set $u = r + 1$ (so that $\binom{m-1+r}{r} = \binom{m-2+u}{m-1}$); as $r$ runs from 0 to $m$, $u$ runs from 1 to $m + 1$.
To have a complete sum, we want it to start at 0; thus we add in the $u = 0$ term, which
is $\binom{m-2}{m-1}$. As $m \ge 2$, this is 0 from the extension of the binomial coefficient (this is the
first of two places where we use $m \ge 2$). Our sum $S_m$ thus equals

$$S_m = -\frac{1}{m(m+1)} \sum_{u=0}^{m+1} (-1)^u \binom{m+1}{u} \binom{m-2+u}{m-1}. \tag{A.9}$$

We now use Lemma A.2 with (in the notation of that lemma) $k = u$, $m = 0$, $\ell = m+1$, $s = m-2$ and $n = m-1$; note
the conditions of that lemma require $s$ to be a non-negative integer, which translates to
our $m \ge 2$. We thus find

$$S_m = -\frac{1}{m(m+1)} (-1)^{m+1} \binom{m-2}{-2} = 0, \tag{A.10}$$

which completes the proof. □
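Lemma A.3 and the intermediate form (A.9) can both be evaluated exactly for small $m$. The following is a minimal sketch with exact fractions; `S` and `S_via_A9` are our names for the two expressions.

```python
from fractions import Fraction
from math import comb

def S(m):
    """The sum of Lemma A.3, evaluated exactly for an integer m >= 1."""
    return sum(
        Fraction((-1) ** r * comb(m, r) * comb(m + r, r), (r + 1) * (m + r))
        for r in range(m + 1)
    )

def S_via_A9(m):
    """The same quantity written as in (A.9); only meaningful for m >= 2."""
    if m < 2:
        raise ValueError("the rewriting (A.9) uses m >= 2")
    total = sum((-1) ** u * comb(m + 1, u) * comb(m - 2 + u, m - 1) for u in range(m + 2))
    return Fraction(-total, m * (m + 1))

if __name__ == "__main__":
    print([S(m) for m in range(1, 8)])
```

The restriction to $m \ge 2$ in `S_via_A9` mirrors the two places in the proof where that hypothesis is used.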

We give another proof of Lemma A.3 below using hypergeometric functions; we
thank Frederick Strauch for showing us this approach.

Remark A.4. We present an alternative proof of Lemma A.3 using the hypergeometric
function

$${}_2F_1(a, b, c; z) = \frac{\Gamma(c)}{\Gamma(b)\Gamma(c-b)} \int_0^1 \frac{t^{b-1}(1-t)^{c-b-1}\,dt}{(1-tz)^a}. \tag{A.11}$$

The following identity for the normalization constant of the Beta function is crucial in
the expansions:

$$B(x, y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}. \tag{A.12}$$

We can use the geometric series formula to expand (A.11) as a power series in $z$ involving
Gamma factors. Rewriting $\binom{m}{r}$ as $(-1)^r \binom{r-m-1}{r}$, after some algebra we find

$$S_m = \frac{\Gamma(m)\,{}_2F_1(-m, m, 2; 1)}{\Gamma(2)\Gamma(1+m)} = \frac{\Gamma(m)}{\Gamma(1+m)\Gamma(2+m)\Gamma(2-m)} \tag{A.13}$$

(our summation over $r$ in the definition of $S_m$ has become the series expansion of
${}_2F_1(-m, m, 2; 1)$), where the last step uses

$${}_2F_1(a, b, c; 1) = \frac{\Gamma(c)\Gamma(c-a-b)}{\Gamma(c-a)\Gamma(c-b)}, \tag{A.14}$$

which follows from the normalization constant of the Beta function. Note that the right
hand side of (A.13) is 1/2 when $m = 1$ and 0 for $m \ge 2$ because for such $m$, $1/\Gamma(2-m) = 0$
due to the pole of $\Gamma(2-m)$.
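The "some algebra" behind (A.13) can be sketched term by term. The Pochhammer symbol $(x)_r = x(x+1)\cdots(x+r-1)$ below is standard notation that we introduce only for this illustration; it does not appear in the remark itself.

```latex
\begin{align*}
(-m)_r &= (-1)^r \frac{m!}{(m-r)!}, \qquad
(m)_r = \frac{(m+r-1)!}{(m-1)!}, \qquad
(2)_r = (r+1)!, \\
\frac{(-m)_r\,(m)_r}{(2)_r\, r!}
  &= (-1)^r \frac{m!\,(m+r-1)!}{(m-r)!\,(m-1)!\,(r+1)!\,r!}, \\
(-1)^r \binom{m}{r}\binom{m+r}{r}\frac{1}{(r+1)(m+r)}
  &= (-1)^r \frac{(m+r-1)!}{r!\,(m-r)!\,(r+1)!}
   = \frac{1}{m}\cdot\frac{(-m)_r\,(m)_r}{(2)_r\, r!}.
\end{align*}
```

Summing over $0 \le r \le m$ (the series terminates because $(-m)_r = 0$ for $r > m$) and writing $1/m = \Gamma(m)/\Gamma(1+m)$ with $\Gamma(2) = 1$ recovers the first equality in (A.13).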

Remark A.5. It is also possible to prove this lemma through symbolic manipulations.
Using the results from [PS, PSR], one may input this into a Mathematica package, which
outputs a proof.

Appendix B. Moments for the family of all curves 
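Birch [Bi] claims the following: Let

$$S_R(p) = \sum_{a \bmod p} \sum_{b \bmod p} \left[ \sum_{x \bmod p} \left( \frac{x^3 - ax - b}{p} \right) \right]^{2R}. \tag{B.1}$$

Then for $p \ge 5$,

$$\begin{aligned}
S_1(p) &= p^2 \\
S_2(p) &= 2p^3 - 3p \\
S_3(p) &= 5p^4 - 9p^2 - 5p.
\end{aligned} \tag{B.2}$$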

There are obviously typos here. We know the Legendre sum is at most $2\sqrt{p}$ in absolute
value, thus we expect $S_R(p)$ to be on the order of $p^2 \cdot (\sqrt{p})^{2R} = p^{R+2}$; note the powers of
$p$ are too low (and they are too high for dividing $S_R(p)$ by the cardinality of the family).

Assuming $S_R(p)$ is a polynomial in $p$, from exploring the results for small $p$ we are
led to

$$\begin{aligned}
S_1(p) &= p^3 - p^2 \\
S_2(p) &= 2p^4 - 2p^3 - 3p^2 + 3p \\
S_3(p) &= 5p^5 - 5p^4 - 9p^3 + 4p^2 + 5p.
\end{aligned} \tag{B.3}$$

Note these are exactly the results from Birch multiplied by $p - 1$; we thank Andrew
Granville for pointing this out to us. In other words, the formulas in Birch are what
remains after dividing by the trivial multiplicative factor $p - 1$.
Let $S'_R(p)$ denote the same sum as $S_R(p)$, but with the additional restriction that
$4a^3 \neq 27b^2$. It is readily seen that $S'_R(p) = S_R(p) - (p - 1)$; the reason is that if
the discriminant equals zero, then $x^3 - ax - b = (x - c)^2 (x - d)$ for some $c, d$, and for $c \neq d$
the sum of these Legendre symbols over all $x$ modulo $p$ is $\pm 1$ (the sum is the same as
$\sum_{x \not\equiv c \bmod p} \left( \frac{x-d}{p} \right) = -\left( \frac{c-d}{p} \right) = \pm 1$), while the pair $a = b = 0$ (where $c = d$) contributes 0. Explicitly, we find

$$\begin{aligned}
S'_1(p) &= p^3 - p^2 - p + 1 \\
S'_2(p) &= 2p^4 - 2p^3 - 3p^2 + 2p + 1 \\
S'_3(p) &= 5p^5 - 5p^4 - 9p^3 + 4p^2 + 4p + 1.
\end{aligned} \tag{B.4}$$
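The closed forms (B.3) and the relation between $S_R(p)$ and $S'_R(p)$ can be confirmed by brute force for small primes. The sketch below evaluates the definition (B.1) directly; the function names are ours, and the Legendre symbol is computed via Euler's criterion.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, returned as -1, 0 or 1."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def moment_sums(p, R):
    """Return (S_R(p), S'_R(p)): the sum (B.1) over all (a, b), and over 4a^3 != 27b^2 only."""
    full = restricted = 0
    for a in range(p):
        for b in range(p):
            inner = sum(legendre(x**3 - a * x - b, p) for x in range(p))
            term = inner ** (2 * R)
            full += term
            if (4 * a**3 - 27 * b**2) % p != 0:
                restricted += term
    return full, restricted

def closed_forms(p):
    """The polynomials of (B.3) for R = 1, 2, 3."""
    return {
        1: p**3 - p**2,
        2: 2 * p**4 - 2 * p**3 - 3 * p**2 + 3 * p,
        3: 5 * p**5 - 5 * p**4 - 9 * p**3 + 4 * p**2 + 5 * p,
    }

if __name__ == "__main__":
    for p in (5, 7, 11):
        for R in (1, 2, 3):
            full, restricted = moment_sums(p, R)
            print(p, R, full, closed_forms(p)[R], full - restricted)
```

The last printed column is $S_R(p) - S'_R(p)$, which should be compared with $p - 1$.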

As the evaluation of these sums is central to this and other investigations, we provide
two proofs of the formula for $S_1(p)$ in the hopes that these arguments will be of use to
other researchers studying similar questions.

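We first give the proof in [Mil1]. We have the following expansion of $\left( \frac{x}{p} \right)$:

$$\left( \frac{x}{p} \right) = G_p^{-1} \sum_{c=1}^{p-1} \left( \frac{c}{p} \right) e\left( \frac{cx}{p} \right), \tag{B.5}$$

where $e\left( \frac{a}{p} \right) = \exp(2\pi i a/p)$ and $G_p = \sum_{a (p)} \left( \frac{a}{p} \right) e\left( \frac{a}{p} \right)$, which equals $\sqrt{p}$ for $p \equiv 1\ (4)$
and $i\sqrt{p}$ for $p \equiv 3\ (4)$. See, for example, [BEW].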
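Both the value of the Gauss sum $G_p$ and the expansion (B.5) are easy to test in floating point. This is a minimal numerical sketch; the helper names are ours and the comparison is only up to rounding error.

```python
import cmath
import math

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, returned as -1, 0 or 1."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def e(a, p):
    """The additive character e(a/p) = exp(2 pi i a / p)."""
    return cmath.exp(2j * math.pi * a / p)

def gauss_sum(p):
    """G_p = sum over a mod p of (a/p) e(a/p)."""
    return sum(legendre(a, p) * e(a, p) for a in range(1, p))

def legendre_via_gauss(x, p):
    """Right hand side of (B.5): G_p^{-1} sum_c (c/p) e(cx/p)."""
    return sum(legendre(c, p) * e(c * x, p) for c in range(1, p)) / gauss_sum(p)

if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        print(p, gauss_sum(p), [round(legendre_via_gauss(x, p).real) for x in range(p)])
```

For $p \equiv 1 \bmod 4$ the printed Gauss sum should be close to $\sqrt{p}$, and for $p \equiv 3 \bmod 4$ close to $i\sqrt{p}$.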

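For the curve $y^2 = x^3 - ax - b$, $a_E(p) = -\sum_{x (p)} \left( \frac{x^3 - ax - b}{p} \right)$. We use (B.5) to
rewrite $a_E(p)$ as

$$a_E(p) = -G_p^{-1} \sum_{x (p)} \sum_{c=1}^{p-1} \left( \frac{c}{p} \right) e\left( \frac{c(x^3 - ax - b)}{p} \right). \tag{B.6}$$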

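We take the complex conjugate, which on the RHS introduces a minus sign into the
exponential and sends $G_p$ to $\overline{G_p}$, and has no effect on the LHS (which is real). The sum
becomes

$$\begin{aligned}
S_1(p) &= \sum_{a=0}^{p-1} \sum_{b=0}^{p-1} a_E(p)\, \overline{a_E(p)} \\
&= \frac{1}{G_p \overline{G_p}} \sum_{x_1, c_1 = 0}^{p-1}\ \sum_{x_2, c_2 = 0}^{p-1} \left( \frac{c_1 c_2}{p} \right) e\left( \frac{c_1 x_1^3 - c_2 x_2^3}{p} \right) \sum_{a=0}^{p-1} e\left( \frac{-(c_1 x_1 - c_2 x_2) a}{p} \right) \sum_{b=0}^{p-1} e\left( \frac{-(c_1 - c_2) b}{p} \right).
\end{aligned} \tag{B.7}$$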

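The $b$-sum vanishes unless $p \mid (c_1 - c_2)$, which only happens if $c_1 = c_2 = c$. The $a$-sum
vanishes unless $p \mid (c x_1 - c x_2)$. As $c \not\equiv 0\ (p)$ (we have the factor $\left( \frac{c}{p} \right)$) this forces
$x_1 = x_2 = x$. As $c$ is non-zero, $\left( \frac{c^2}{p} \right) = 1$, the first exponential factor is 1, and (since $G_p \overline{G_p} = p$) the sums
collapse to

$$S_1(p) = \frac{1}{p} \sum_{c=1}^{p-1} \sum_{x=0}^{p-1} \sum_{a=0}^{p-1} \sum_{b=0}^{p-1} 1 = p^3 - p^2. \tag{B.8}$$

Remark B.1. We sketch an alternate proof for $S_1(p)$. We have

$$S_1(p) = \sum_{a \bmod p} \sum_{b \bmod p} \sum_{x \bmod p} \sum_{y \bmod p} \left( \frac{x^3 - ax - b}{p} \right) \left( \frac{y^3 - ay - b}{p} \right). \tag{B.9}$$

We use the following result. For $c_1 \not\equiv c_2 \bmod p$, let

$$\mathcal{R} = \sum_{n \bmod p} \left( \frac{n + c_1}{p} \right) \left( \frac{n + c_2}{p} \right) = \sum_{n \bmod p} \left( \frac{n^2 + n(c_2 - c_1)}{p} \right) = \sum_{n \bmod p} \left( \frac{n^2 + an(c_2 - c_1)}{p} \right) \tag{B.10}$$

for any $a \not\equiv 0 \bmod p$ (replace $n$ with $an$ and use $\left( \frac{a^2}{p} \right) = 1$). Thus

$$(p-1)\mathcal{R} = \sum_{a \not\equiv 0 \bmod p}\ \sum_{n \bmod p} \left( \frac{n^2 + an(c_2 - c_1)}{p} \right) = -(p-1), \tag{B.11}$$

since including $a \equiv 0$ gives a double sum that vanishes (for each fixed $n \not\equiv 0$ the argument runs over all residues as $a$ varies), and the $a \equiv 0$ terms contribute $\sum_{n} \left( \frac{n^2}{p} \right) = p - 1$. So $\mathcal{R} = -1$. Thus

$$\sum_{n \bmod p} \left( \frac{n + c_1}{p} \right) \left( \frac{n + c_2}{p} \right) = \begin{cases} p - 1 & \text{if } c_1 \equiv c_2 \bmod p \\ -1 & \text{otherwise.} \end{cases} \tag{B.12}$$

We rewrite our sum (replacing $a$ with $-a$ and $b$ with $-b$) as

$$S_1(p) = \sum_{a \bmod p} \sum_{x \bmod p} \sum_{y \bmod p} \sum_{b \bmod p} \left( \frac{b + (x^3 + ax)}{p} \right) \left( \frac{b + (y^3 + ay)}{p} \right). \tag{B.13}$$

When is $x^3 + ax \equiv y^3 + ay \bmod p$? This is always true if $x = y$ and $a$ is arbitrary,
which gives a contribution of $p \cdot p \cdot (p - 1)$. If $x \neq y$ (which happens $p^2 - p$ times),
there is a unique value of $a$ that works, namely $-(x^3 - y^3)/(x - y)$. For this special
$a$ the contribution is $(p^2 - p) \cdot 1 \cdot (p - 1)$, and for the other $a$ the contribution is
$(p^2 - p) \cdot (p - 1) \cdot (-1)$. Adding yields $p^3 - p^2$.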
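The two ingredients of Remark B.1, the evaluation of the shifted character sum and the final case analysis, can be checked directly for small primes. The following is a minimal sketch; the function names are ours.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, returned as -1, 0 or 1."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def shifted_product_sum(c1, c2, p):
    """Sum over n mod p of ((n + c1)/p) * ((n + c2)/p)."""
    return sum(legendre(n + c1, p) * legendre(n + c2, p) for n in range(p))

def s1_from_case_analysis(p):
    """S_1(p) via the remark: each (a, x, y) contributes p - 1 or -1 from the b-sum."""
    total = 0
    for a in range(p):
        for x in range(p):
            for y in range(p):
                same = (x**3 + a * x - y**3 - a * y) % p == 0
                total += (p - 1) if same else -1
    return total

if __name__ == "__main__":
    for p in (5, 7, 11):
        print(p, s1_from_case_analysis(p), p**3 - p**2)
```

The triple loop costs $p^3$ steps, so it is only meant for very small $p$.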